Abstract. The behavior of complex networks under failure or attack depends strongly on the specific
scenario. Of special interest are scale-free networks, which are usually seen as robust under random failure
but appear to be especially vulnerable to targeted attacks. Recent studies of the public transport networks of
fourteen major cities of the world have shown that these systems, when represented by appropriate graphs,
may exhibit scale-free behavior [C. von Ferber et al., Physica A 380, 585 (2007); Eur. Phys. J. B 68, 261
(2009)]. Our present analysis focuses on the effects that defunct or removed nodes have on the properties
of public transport networks. Simulating different directed attack strategies, we derive vulnerability criteria
that result in minimal strategies with high impact on these systems.

1 Introduction



The question of the resilience or vulnerability of a complex network against failure of its parts has, besides purely academic interest, a whole range of important practical implications. In what follows, any such failure will be called an attack. In practice, the origin of an attack and its scenario may differ to a large extent, ranging from random failure, when a node or a link in a network is removed at random, to targeted destruction, when the most influential network constituents are removed according to their operating characteristics. The notion of attack vulnerability of complex networks originates from studies of computer networks and was coined to denote the decrease of network performance caused by the removal of either nodes or links. The behavior of a complex network under attack has been observed to differ drastically from that of regular lattices. Early evidence of this fact was found in particular for real-world networks that show scale-free behavior: the world wide web and the internet [2,3], as well as metabolic [4], food web [5], and protein [6] networks. It appeared that these networks display a high degree of robustness against random failure. However, if the scenario is changed towards targeted attacks, the same networks may appear to be especially vulnerable [7,8].

Essential progress towards a theoretical description of the attack vulnerability of complex networks is due to the application of the tools and concepts of percolation phenomena [9]. On a lattice, percolation occurs e.g. when at a given concentration of bonds a spanning cluster appears. This concentration $c_{\rm perc}$, which is determined by an appropriate ensemble average in the thermodynamic limit, is the so-called percolation threshold, which is in general lattice dependent. On a general network the corresponding phenomenon is the emergence of a giant connected component (GCC), i.e. a connected subnetwork which in the limit of an infinite network contains a finite fraction of the network. For a random graph, where given vertices are linked at random, this threshold has been shown to be reached at one bond per vertex [10]. However, the distribution $p(k)$ of the degrees $k$ of vertices in a random graph is Poissonian. A more general criterion, applicable to networks with given degree distribution $p(k)$ but otherwise random linking between vertices, has been proposed by Molloy and Reed [11,7,8]. For such equilibrium networks a GCC can be shown to be present if

$$ \langle k(k-2) \rangle > 0 \tag{1} $$

with the appropriate ensemble average $\langle \dots \rangle$ over networks with given degree distribution. Defining the Molloy-Reed parameter as the ratio of the moments of the degree distribution

$$ \kappa^{(k)} = \langle k^2 \rangle / \langle k \rangle \tag{2} $$

the percolation threshold can then be determined by

$$ \kappa^{(k)} = 2 \quad \text{at } c_{\rm perc}. \tag{3} $$

Given that for scale-free networks the degree distribution obeys power-law scaling

$$ p(k) \sim k^{-\gamma} \tag{4} $$

one finds that the second moment $\langle k^2 \rangle$ diverges for $\gamma < 3$. Thus, the value $\gamma = 3$ separates two different regimes for percolation on equilibrium scale-free networks [7]. Indeed, for infinite equilibrium scale-free networks $\kappa^{(k)}$ remains finite for $\gamma > 3$, whereas for $\gamma < 3$ a GCC is found to exist at any concentration of removed sites: the network appears to be extremely robust to random removal of nodes. Therefore, the transitions observed for real-world systems [2,3,4,5,6] may, from the theoretical standpoint, be seen as finite-size effects or as resulting from essential degree-degree correlations. The tolerance of scale-free networks to intentional attacks (when the highest degree nodes are removed) was studied in Ref. [12]. It was shown that even networks with $\gamma < 3$ may be sensitive to intentional attacks.
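
As an illustration of criterion (1) and of the Molloy-Reed parameter (2), the following minimal Python sketch evaluates both for a given degree sequence. The function names and the two degree sequences are hypothetical and chosen only for illustration; the criterion itself refers to ensemble averages over uncorrelated networks, so for a single finite graph the result is at best indicative.

```python
def molloy_reed_parameter(degrees):
    """Return kappa = <k^2> / <k> for a sequence of node degrees, cf. Eq. (2)."""
    # The 1/N factors of the two averages cancel in the ratio.
    return sum(k * k for k in degrees) / sum(degrees)


def has_giant_component(degrees):
    """Molloy-Reed criterion <k(k-2)> > 0, cf. Eq. (1); equivalent to kappa > 2."""
    return sum(k * (k - 2) for k in degrees) > 0


# Hypothetical degree sequences, for illustration only.
ring = [2] * 10                  # every node has degree 2: exactly at the threshold
hub_and_leaves = [1] * 8 + [8]   # one hub of degree 8 plus eight leaves

for name, seq in (("ring", ring), ("hub_and_leaves", hub_and_leaves)):
    print(name, molloy_reed_parameter(seq), has_giant_component(seq))
```

The ring sits exactly at $\kappa^{(k)} = 2$, so the strict inequality (1) is not satisfied, whereas the broad degree sequence lies well above the threshold.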

Obviously, the above theoretical results apply to ideal complex networks and to ensemble averages, and may be confirmed only within a certain accuracy when applied to different individual real-world networks. Finite-size effects are not the only origin of this discrepancy [13]. Furthermore, even networks of similar type (e.g. of similar node degree distribution and size) may be characterized by a large variety of other characteristics. While some of them may have no impact on the percolation properties [14], others do modify the behavior under attack, as empirically revealed in Ref. [15] for two different real-world scale-free networks (computer and collaboration networks). Therefore, an empirical analysis of the behavior of different real-world networks under attack appears timely. It will allow one not only to elaborate scenarios for possible defence mechanisms of operating networks, but also to create strategies for constructing networks that are robust to attacks of various types.

In this paper, we present results of an analysis of the behavior of networks of public transport in large cities (public transport networks, PTNs) under attacks following various scenarios. To our knowledge, the resilience of PTNs under attack has so far not been treated in terms of complex network concepts. Furthermore, we analyze in parallel a number of complex networks of the same type. Previous analyses usually focused on a single instance of a network of a given type [16]. Our study intends to show that even within a sample of several networks that were created for the same purpose, namely PTNs, one may observe essential diversity with respect to the behavior under attacks of various scenarios.

As mentioned above, the attack resilience of a network may be tested within a variety of different attack scenarios. In one of them, a list of nodes ordered by decreasing degree is prepared for the unperturbed network and the attack successively removes vertices according to this original list [17,18]. In a slightly different scenario, the vertex degrees are recalculated and the list is reordered after each removal step [5]. Initial studies observed only little difference between these two scenarios [8]; however, further analysis showed [15,19] that attacks according to recalculated lists often turn out to be more harmful than attack strategies based on the initial list, suggesting that the network structure changes as important vertices or edges are removed. Other scenarios consider attacks following an order imposed by other measures of the centrality of a node, e.g. the so-called betweenness centrality [15]. In particular, for the world-wide airport network it has been shown recently [20,21] that nodes with higher betweenness play a more important role in keeping the network connected than those with high degree. In our study, we will make use of the scenarios proposed so far as well as develop further algorithms to perform network attacks.
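
The difference between an attack based on the initial degree list and one based on recalculated degrees can be made concrete with a small sketch. The Python code below is not the implementation used in this study; the helper names, the alphabetical tie-breaking rule, and the eight-node example graph are assumptions made only for illustration.

```python
def build_adjacency(edges):
    """Undirected graph as a dict mapping each node to the set of its neighbours."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_attack(adj, steps, recalculate):
    """Return the nodes removed in `steps` steps of a highest-degree attack.

    recalculate=False: follow the list prepared for the unperturbed network.
    recalculate=True:  re-rank the remaining nodes after every removal.
    Ties are broken alphabetically (an arbitrary, assumed convention).
    """
    adj = {v: set(nbs) for v, nbs in adj.items()}  # work on a copy
    initial_order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    removed = []
    for step in range(steps):
        if recalculate:
            target = min(adj, key=lambda v: (-len(adj[v]), v))
        else:
            target = initial_order[step]
        for nb in adj.pop(target):       # delete the node and its links
            adj[nb].discard(target)
        removed.append(target)
    return removed


# Hypothetical example graph: a hub region (A..E) and a small separate star (F, G, H).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("B", "C"), ("B", "D"), ("F", "G"), ("F", "H")]
graph = build_adjacency(edges)

print(degree_attack(graph, 3, recalculate=False))  # ['A', 'B', 'C']
print(degree_attack(graph, 3, recalculate=True))   # ['A', 'B', 'F']
```

After the first two removals node C has lost all its links, so the recalculated list moves on to F, while the initial list still spends a step on C.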

The paper is organized as follows. In the next section we describe the database, define the observables in terms of which we follow the changes in the network properties under attack, and describe the different attack strategies we will use. We display our principal results in the sections that follow. There, we formulate criteria that allow us to estimate the resilience of networks against attacks and discuss the behavior of the PTNs during attacks following different strategies, outlining the most effective ones. Conclusions and an outlook are given in the final section.

2 Databases, observables, and attack strategies

This study continues our analysis of the properties of PTNs initiated in Refs. [22,23,24]. As in these works, we rely on the publicly available information about the PTNs of a set of fourteen major cities of the world [25]. Our selection of these cities was motivated by the idea of collecting network samples from cities of different geographical, cultural, and economical background. Table 1 summarizes some of the empirically determined properties of the PTNs under consideration.

There are various ways to represent a PTN in terms of a graph [26]. These different representations allow for a comprehensive analysis of various PTN properties reflecting their operating functions, and it is natural to analyze PTN attack resilience in terms of them. The representations are briefly summarized in Fig. 1. For the purpose of the present analysis, we will make use of the so-called L-space and P-space graphs. In the L-space representation, the PTN is represented by a graph with nodes that correspond to the stations, whereas links correspond to connections between stations within one stop distance (Fig. 1b). In the P-space [27], all station-nodes that belong to the same route form a complete subgraph of the network (Fig. 1c).
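
The two representations can be sketched in a few lines of Python. The route lists below are hypothetical (they are not the routes of Fig. 1), and the function names are introduced only for this illustration.

```python
from itertools import combinations


def l_space_edges(routes):
    """L-space: link stations that are consecutive stops on at least one route."""
    edges = set()
    for stops in routes:
        for a, b in zip(stops, stops[1:]):
            if a != b:
                edges.add(frozenset((a, b)))  # a set of edges: no multiple links
    return edges


def p_space_edges(routes):
    """P-space: all stations of a route form a complete subgraph."""
    edges = set()
    for stops in routes:
        for a, b in combinations(sorted(set(stops)), 2):
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical routes over stations A-F, each given as an ordered list of stops.
routes = [["A", "B", "C", "D"], ["B", "E"], ["C", "E", "F"]]

print(len(l_space_edges(routes)))  # 6 links
print(len(p_space_edges(routes)))  # 10 links
```

Storing each link as a `frozenset` makes the graphs undirected and automatically merges links contributed by several routes.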

Let us take the L-space representation to introduce 
the observables we will use to quantify the PTN behavior 
Fig. 1. (color online) a: a simple public transport map. Stations A-F are serviced by routes No 1 (shaded orange), No 2 (white),
and No 3 (dark blue). b: L-space graph. c: P-space graph; the complete sub-graph corresponding to route No 1 is highlighted
(shaded orange).
City         Type       N     R  ⟨k⟩_L  ℓmax_L  ⟨ℓ⟩_L    c_L  κ(z)_L  κ(k)_L     γ_L   ⟨k⟩_P  ℓmax_P  ⟨ℓ⟩_P    c_P  κ(z)_P  κ(k)_P     γ_P
Berlin       BSTU    2992   211   2.58      68   18.5   52.8    1.96    3.16  (4.30)   56.61       5    2.9   41.9   11.47   84.51  (5.85)
Dallas       B       5366   117   2.18     156   52.0   55.0    1.28    2.35    5.49  100.58       8    3.2   48.6   11.23  145.65  (4.67)
Düsseldorf   BST     1494   124   2.57      48   12.5   24.4    1.96    3.16    3.76   59.01       5    2.6   19.7   10.56   91.17  (4.62)
Hamburg      BFSTU   8084   708   2.65     156   39.7  254.7    1.85    3.26  (4.74)   50.38      11    4.7  132.2    7.96   79.43    4.38
Hong Kong    B       2024   321   3.59      60   11.0   60.3    3.24    5.34  (2.99)  125.67       4    2.2   11.7   10.20  232.73  (4.40)
Istanbul     BST     4043   414   2.30     131   29.7   41.0    1.54    2.69    4.04   76.88       6    3.1   41.5   10.59  140.13  (2.70)
London       BST    10937   922   2.60     107   26.5  320.6    1.87    3.22    4.48   90.60       6    3.3   90.0   16.94  166.95    3.89
Los Angeles  B      44629  1881   2.37     210   37.1  645.3    1.59    2.73    4.85   97.99      11    4.4  399.6   17.21  159.86    3.92
Moscow       BEST    3569   679   3.32      27    7.0  127.4    6.25    7.91  (3.22)   65.47       5    2.5   38.0   26.48  130.65  (2.91)
Paris        BS      3728   251   3.73      28    6.4   78.5    5.32    6.93    2.62   50.92       5    2.7   59.6   24.06   88.89    3.70
Rome         BT      3961   681   2.95      87   26.4  163.4    2.02    3.67  (3.95)   69.05       6    3.1   41.4   11.34  108.08  (5.02)
Sao Paolo    B       7215   997   3.21      33   10.3  268.0    4.17    5.95    2.72  137.46       5    2.7   38.2   19.61  333.73  (4.06)
Sydney       B       1978   596   3.33      34   12.3   82.9    2.54    4.37  (4.03)   42.88       7    3.0   33.6    7.79   74.63  (5.66)
Taipei       B       5311   389   3.12      74   20.9  186.2    2.42    4.02  (3.74)  236.65       6    2.4   15.4   12.96  415.46  (5.16)

Table 1. Some characteristics of the PTNs analyzed in this study. Types of transport taken into account: Bus, Electric trolleybus,
Ferry, Subway, Tram, Urban train; N: number of stations; R: number of routes. The following characteristics are given in L-
and P-spaces, as indicated by the subscripts: ⟨k⟩ (mean node degree); ℓmax, ⟨ℓ⟩ (maximal and mean shortest path length); c
(relation of the mean clustering coefficient to that of the classical random graph of equal size); κ(z), κ(k) (c.f. Eqs. (14), (2)); γ
(an exponent in the power law (4) fit, bracketed values indicate less reliable fits, see text). More data is given in [23].



under attack. Keep in mind, however, that in the analysis presented below we will also deal with the P-space. Two intrinsically connected questions naturally arise when one wants to describe quantitatively how a certain network changes when its nodes are removed [28]. The first is how to choose the 'order-parameter' variable that signals the change in the network behavior (i.e. the breakdown of the network); the second is how to locate the concentration of removed nodes at which this change occurs. As mentioned in the introduction, in a theoretical description a useful quantity is the GCC: its disappearance can be associated with a network breakdown. Strictly speaking, the GCC is well-defined only in the $N \to \infty$ limit; in practice, when dealing with a network of finite size $N$, it is therefore substituted by the largest connected component. In the following we will use its normalized size, defined by

$$ S = N_1/N, \tag{5} $$

with $N$ and $N_1$ being the number of nodes of the network and of its largest component, respectively. By definition (5), a largest component is always present in a network of non-zero size. A useful quantity to measure network connectivity is the average shortest path:

$$ \langle \ell \rangle = \frac{2}{N(N-1)} \sum_{i>j} \ell(i,j), \tag{6} $$

where $\ell(i,j)$ is the length of a shortest path from node $i$ to $j$ and the sum spans all pairs $i,j$ of sites of the network. However, $\langle \ell \rangle$ is ill-defined for a disconnected network. Alternatively, one can suitably define the mean inverse shortest path length [13] by:

$$ \langle \ell^{-1} \rangle = \frac{2}{N(N-1)} \sum_{i>j} \ell^{-1}(i,j), \tag{7} $$

with $\ell^{-1}(i,j) = 0$ if nodes $i,j$ are disconnected. As one can see, Eq. (7) is well-defined even for a disconnected network and as such can be used to trace changes of network behavior under attack. To give an example, we show in Fig. 2 how the largest component fraction $S$, Eq. (5), and the mean inverse shortest path length $\langle \ell^{-1} \rangle$, Eq. (7), change upon random removal of nodes in each of the fourteen PTNs selected for our study. More precisely, we measure these quantities as functions of the fraction of removed nodes $c$, starting from the unperturbed network ($c = 0$) and eliminating at random, step by step, 1 % of the nodes until $c = 1$. In what follows we will call this the random scenario.

Fig. 2. (color online). L-space. Random scenario. Size of the largest cluster $S$ (a.) and the mean inverse shortest path length $\langle \ell^{-1} \rangle$ (b.) as functions of the fraction of removed nodes $c$, normalized by their values at $c = 0$.
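
The random scenario and the two observables can be sketched as follows. This is a minimal pure-Python illustration, not the code used to produce Fig. 2: the function names are hypothetical, the largest component is normalized by the original number of nodes, and the mean inverse path length is normalized by the current number of nodes; both normalizations are assumptions of this sketch.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for nb in adj[v]:
            if nb not in dist:
                dist[nb] = dist[v] + 1
                queue.append(nb)
    return dist


def largest_component_size(adj):
    """Number of nodes N_1 in the largest connected component."""
    seen, best = set(), 0
    for v in adj:
        if v not in seen:
            component = bfs_distances(adj, v)
            seen.update(component)
            best = max(best, len(component))
    return best


def mean_inverse_path(adj):
    """Mean inverse shortest path length; disconnected pairs contribute 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for v in adj:
        total += sum(1.0 / d for d in bfs_distances(adj, v).values() if d > 0)
    # `total` runs over ordered pairs, so the factor 2 of the i>j sum is absorbed.
    return total / (n * (n - 1))


def random_attack(adj, step_fraction=0.01, seed=0):
    """Remove nodes at random in batches; return a list of (c, S, <1/l>) tuples."""
    adj = {v: set(nbs) for v, nbs in adj.items()}  # work on a copy
    n0 = len(adj)
    order = list(adj)
    random.Random(seed).shuffle(order)
    batch = max(1, round(step_fraction * n0))
    curve = [(0.0, largest_component_size(adj) / n0, mean_inverse_path(adj))]
    for start in range(0, n0, batch):
        for node in order[start:start + batch]:
            for nb in adj.pop(node):
                adj[nb].discard(node)
        c = min(n0, start + batch) / n0
        curve.append((c, largest_component_size(adj) / n0, mean_inverse_path(adj)))
    return curve
```

Each breadth-first search costs time proportional to the number of links, so evaluating the inverse path length at every step is the expensive part for networks with thousands of stations.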

Already this first attack attempt reveals interesting (and in part unexpected) PTN features. Namely:

(i) different PTNs react to random removal of their nodes in different ways, ranging from rapid abrupt breakdown (Dallas) to a slow, almost linear decrease (Paris);

(ii) although qualitatively similar, the observed impact of the attack differs depending on which variable is used as indicator, either $S$ or $\langle \ell^{-1} \rangle$. The ordering of the PTNs by their vulnerability may thus differ depending on the applied indicator;

(iii) up to $c = 1$, there is no general 'percolation threshold' concentration of removed nodes $c$ at which $S$ (or $\langle \ell^{-1} \rangle$) vanishes that would hold for all PTNs. Rather, individual PTNs show abrupt changes of their properties at various values of $c$.

Figs. 2a,b display how the different PTNs react to a random removal of their nodes. The question immediately arises how this behavior changes if one removes the nodes not at random, but following a given order or scheme (we call this the scenario of the attack). As mentioned in the introduction, a number of different attack scenarios have been proposed [2,8,15,17,18,19,20,21]. These are generally based on the intuitive assumption that the largest impact on a network is caused by the removal of its most 'important' nodes. A number of indicators have been developed, in particular in applications of graph theory to social science, to measure the importance of a node. Besides the node degree $k_j$, which is equivalent to the number of nearest neighbors $z_1(j)$ of a given node $j$, different centralities have been introduced for this purpose. In particular, the closeness $C_C(j)$, graph $C_G(j)$, stress $C_S(j)$, and betweenness $C_B(j)$ centralities of a node $j$ are defined as follows (see e.g. [29]):

$$ C_C(j) = \frac{1}{\sum_{t \in \mathcal{N}} \ell(j,t)}, \tag{8} $$

$$ C_G(j) = \frac{1}{\max_{t \in \mathcal{N}} \ell(j,t)}, \tag{9} $$

$$ C_S(j) = \sum_{s \neq j \neq t \in \mathcal{N}} \sigma_{st}(j), \tag{10} $$

$$ C_B(j) = \sum_{s \neq j \neq t \in \mathcal{N}} \frac{\sigma_{st}(j)}{\sigma_{st}}. \tag{11} $$

In Eqs. (8)-(11), $\ell(j,t)$ is the length of a shortest path between the nodes $j, t$ that belong to the network $\mathcal{N}$, $\sigma_{st}$ is the number of shortest paths between the two nodes $s, t \in \mathcal{N}$, and $\sigma_{st}(j)$ is the number of shortest paths between nodes $s$ and $t$ that go through the node $j$. Alternatively, one may measure the importance of a given node $j$ by the number of its second nearest neighbors $z_2(j)$ or by its clustering coefficient $C(j)$. The latter is the ratio of the number of links $E_j$ between the $k_j$ nearest neighbors of $j$ and the maximal possible number of mutual links between them:

$$ C(j) = \frac{2 E_j}{k_j (k_j - 1)}. \tag{12} $$

Removing important nodes according to lists prepared in the order of decreasing node degree $k$, centralities (8)-(11), number of second nearest neighbors $z_2$, and increasing clustering coefficient $C$ defines seven different attack scenarios. As already mentioned in the introduction, the scenarios can be implemented either according to lists prepared for the initial PTN before the attack (we will indicate the corresponding scenario by a superscript $i$, e.g. $C_B^i$) or according to lists rebuilt by recalculating the order of the remaining nodes after each step. Together, this leads to fourteen different attack scenarios. In addition, we will keep the random scenario described above (denoted further as RV) and add one more scenario, removing a randomly chosen neighbor of a randomly chosen node (RN). The latter scenario appears to be effective for immunization problems [30]. It is based on the fact that in this way nodes with a high number of neighbors will be selected with higher probability. Note that in this scenario only the neighbor node is removed and not the initially chosen one.

Fig. 3. (color online). Largest component size of the PTN of Paris as function of the fraction of removed nodes for different attack scenarios. Each curve corresponds to a different scenario as indicated in the legend. Lists of removed nodes were prepared according to their degree $k$, closeness $C_C$, graph $C_G$, stress $C_S$, and betweenness $C_B$ centralities, clustering coefficient $C$, and next nearest neighbors number $z_2$. A superscript $i$ refers to lists prepared for the initial PTN before the attack. RV and RN denote the removal of a random vertex (RV) or of its randomly chosen neighbor (RN), respectively.
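
The bias of the RN scenario towards highly connected nodes can be checked on a toy example. The following Python sketch is illustrative only: the function names are hypothetical, and skipping isolated nodes when drawing the first node is an assumption of this sketch rather than a rule stated in the text.

```python
import random


def pick_random_neighbor(adj, rng):
    """RN step: choose a node at random, then return one of its neighbours."""
    starters = [v for v in adj if adj[v]]  # assumption: skip isolated nodes
    if not starters:
        return None
    v = rng.choice(starters)
    return rng.choice(sorted(adj[v]))      # sorted() only fixes the draw order


def rn_selection_probability(adj):
    """Exact probability that each node is the one removed in a single RN step."""
    starters = [v for v in adj if adj[v]]
    prob = {v: 0.0 for v in adj}
    for v in starters:
        for nb in adj[v]:
            prob[nb] += 1.0 / (len(starters) * len(adj[v]))
    return prob


# Hypothetical star graph: one hub connected to four leaves.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}

print(rn_selection_probability(star))
```

In the star, a uniformly chosen vertex (RV) hits the hub with probability 1/5, whereas the RN step removes the hub with probability 4/5, since every leaf points to it.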

Altogether, this defines sixteen different scenarios to attack a network, and we apply these to all fourteen PTNs that form our database. A typical result for a single PTN is displayed in Fig. 3. Here, we show how the largest connected component size $S$ of the Paris PTN changes under the influence of the attack scenarios described above. Already from this plot one may discriminate between the most effective scenarios, which result in a fast decrease of the largest component size (those governed by betweenness and stress centralities, node degree, and next nearest neighbors number, see the figure), and the less harmful ones. In the following, instead of displaying the results of all attacks for all PTNs, we will focus on the results of the most effective scenarios, comparing them with those of random failure as introduced by the random scenario. As outlined above, we make use of different PTN representations (the different 'spaces' of Fig. 1). In the following section, we present the analysis of PTN resilience in the L-space representation.

3 Results in L-space 

The L-space representation of a PTN is a graph that represents each station by a node; a link between two nodes indicates that there is at least one route that services the two corresponding stations consecutively. No multiple links are allowed (see Fig. 1b). Therefore, attacks in the L-space correspond to situations in which given public transport stations cease to operate for all means of traffic that go through them. Note, however, that in this representation the removal of a station node does not otherwise interfere with the operation of a route that includes this station. Rather, it splits this route into two (operating) pieces. An alternative situation will be considered in the forthcoming section.

In order to answer some of the questions raised in Section 2, let us return to Fig. 3, where the impact on the largest component size $S$ of the PTN of Paris is shown for sixteen different attack scenarios as function of the fraction of removed nodes. As we have already remarked,
for this PTN the most influential are the scenarios where nodes are removed according to lists ordered by $C_B$, $k$, $C_S$, $C_B^i$, $C_S^i$ (we list the characteristics in decreasing order of effectiveness of the corresponding scenario).
For small values of $c$ ($c < 0.07$) these scenarios have a practically indistinguishable impact on $S$, with a linear behavior $S \sim (1 - c)$. As $c$ increases, deviations from the linear behavior arise and the impact of the different scenarios starts to vary. In particular, differences appear between the roles played by the nodes with the highest value of $k$ and those with the highest betweenness centrality $C_B$. Whereas the first quantity is a local one, i.e. it is calculated from properties of the immediate environment of each node, the second one is global. Moreover, the $k$-based strategy aims to remove a maximal number of edges, whereas the $C_B$-based strategy aims to cut as many shortest paths as possible. In addition, differences arise between the 'initial' and 'recalculated' scenarios, suggesting that the network structure changes as important nodes are removed. Similar behavior of $S(c)$ is observed for all PTNs included in this study, with certain peculiarities in the order of effectiveness of the different attack scenarios. Note, however, that the difference between 'initial' and 'recalculated' scenarios is less evident for strategies based on local characteristics, such as the node degree or the number of second nearest neighbors (c.f. the curves for $k$, $k^i$ and $z_2$, $z_2^i$, respectively). The difference is more pronounced for the centrality-based scenarios.

Now let us return to some of the observations of Section 2. Namely, we noted that the observed impact of an attack may differ depending on which observable is used as the 'order-parameter' variable (c.f. Fig. 2, where this is shown for the RV attack scenario taking either $S$ or $\langle \ell^{-1} \rangle$ as 'order parameter'). We observe similar differences for the other scenarios as well. To be definite, in the following we will use the value of $S$ to measure the effectiveness of a given attack. This choice is motivated by several reasons: (i) in the infinite network limit $S$ defines the order parameter of the classical percolation problem [9]; (ii) differences between network resilience as judged e.g. by the behavior of $S$ or by that of $\langle \ell^{-1} \rangle$ are not significant enough to be a subject of special analysis (at least not for the PTNs we consider); (iii) considering $S$ naturally leads to other useful characteristics that allow one to estimate the PTN operating ability and its segmentation. Let us elaborate on the latter point in more detail.

Fig. 4. (color online). L-space. Recalculated highest degree scenario. a. Behavior of the maximal shortest path $\ell_{\max}$ for the PTNs of Paris and London. Note the characteristic peaks that occur at $c = 0.13$ (Paris) and $c = 0.06$ (London). b. Size of the largest connected cluster $S$ as function of the fraction of removed nodes for the same networks. The arrows indicate the values of $c$ at which the peak of $\ell_{\max}$ appears.

As we have already emphasized, there is no well-defined 'percolation threshold' concentration of removed nodes $c_{\rm perc}$ at which $S$ (or $\langle \ell^{-1} \rangle$) vanishes (see Figs. 2, 3) and which could serve as evidence of a breakdown of the largest PTN component and hence of the loss of operating ability [31]. In Ref. [21] it has been proposed to use the behavior of the maximal shortest path length $\ell_{\max}$ as a possible indicator of the network breakdown. This was based on the observation that, as the concentration of removed nodes $c$ increases, the value of $\ell_{\max}$ for different PTNs displays similar typical behavior: initial growth and then an abrupt decrease when a certain threshold is reached (see e.g. Fig. 4a, where this value is shown for the recalculated highest degree attack scenario of the PTNs of Paris and London). Removing nodes initially increases the path lengths, as detours from the original shortest paths need to be taken. Further removal of nodes then at some point leads to the breakup of the network into smaller components, on which the paths are naturally limited by the size of these components; this explains the sudden decrease of their lengths. For comparison, Fig. 4b shows how the value of $S$ changes under the recalculated highest degree scenario for the same PTNs.

While certainly useful for many of the PTNs analyzed, the above $\ell_{\max}$-based criterion cannot serve as a universal tool to determine the region of $c$ where the network stops operating. One of the reasons is that for certain PTNs (as well as for certain attack scenarios) we have found that $\ell_{\max}$ does not show a single pronounced maximum, but rather several maxima at different values of $c$. Therefore, to devise a criterion which may be used equally well for any of the networks, we define a characteristic concentration of removed nodes $c_s$ at which the size of the largest component $S$ decreases to one half of its initial value. This characteristic concentration allows us to compare the effective robustness of different PTNs, or of the same PTN when different attack scenarios are applied. In what follows, we will call this concentration the segmentation concentration $c_s$, with the obvious condition:

$$ S(c_s) = \frac{1}{2} S(c = 0). \tag{13} $$

Fig. 5. (color online). L-space. Random scenario. Size of the largest cluster $S$, normalized by its value at $c = 0$, as function of the fraction of removed nodes. From this figure it is easy to determine the fraction of nodes $c_s$ which satisfies Eq. (13).
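
Given a sampled curve $S(c)$, condition (13) can be evaluated numerically. The Python sketch below is one possible way to do so; the function name is hypothetical, the linear interpolation between sampled points is an assumption of this sketch, and the sample curve is invented for illustration.

```python
def segmentation_concentration(curve):
    """Return c_s with S(c_s) = S(0)/2 for a curve given as (c, S) pairs.

    `curve` must be sorted by increasing c and start at c = 0. The crossing is
    located by linear interpolation between the two bracketing samples.
    Returns None if S never drops to one half of its initial value.
    """
    half = 0.5 * curve[0][1]
    for (c_prev, s_prev), (c_next, s_next) in zip(curve, curve[1:]):
        if s_next <= half:
            if s_prev == s_next:  # degenerate flat segment, nothing to interpolate
                return c_next
            return c_prev + (s_prev - half) * (c_next - c_prev) / (s_prev - s_next)
    return None


# Invented sample curve: (fraction of removed nodes, largest component fraction).
sample = [(0.0, 1.0), (0.1, 0.8), (0.2, 0.4), (0.3, 0.1)]
print(segmentation_concentration(sample))  # close to 0.175
```

With removal steps of 1 % of the nodes, the interpolation changes $c_s$ by less than the step size, so simply taking the first sampled $c$ with $S \le S(0)/2$ would serve equally well.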

In Fig. 5 we plot the size of the largest connected component $S$ for different PTNs as function of the fraction of removed nodes $c$ for the random vertex scenario (RV) in L-space. The lowest value shown on the vertical axis of this figure is $S = 1/2$, so that $c_s$ can be read off as the crossing point of $S(c)$ with the horizontal axis. The values of $c_s$ obtained for this scenario are given in the last column of Table 2. Note that the PTNs under consideration react to random attack in many different ways: some of them decrease slowly without any abrupt changes in $S$ (like the PTNs of Paris, Moscow, Sydney), while others are characterized by a rather fast decay of $S$ (Dallas, Los Angeles, Istanbul).

Table 2. Segmentation concentration $c_s$ for different attack scenarios applied to different PTNs. For each city, the table displays the results of the five most destructive attack scenarios, ordered by increasing values of $c_s$. The scenario is indicated after the corresponding value of $c_s$. The scenarios are abbreviated by the name of the characteristic used to prepare the lists of removed nodes (see Sec. 2 for a detailed explanation). The last column shows the value of $c_s$ for the random scenario (RV).

Now, applying the attacks according to the sixteen scenarios described above, we are in a position to discriminate them by their degree of destruction and to single out those with the highest impact on each of the PTNs considered. To this end, for each PTN we give in Table 2 the segmentation concentration $c_s$ for the five most harmful attack scenarios. The obtained values of $c_s$ are given in increasing order, and next to each value we indicate the scenario that was implemented. Our analysis reveals the most harmful scenarios to be those targeted at nodes with the highest values of either the node degree $k$, the betweenness centrality $C_B$, the next nearest neighbor number $z_2$, or the stress centrality $C_S$, recalculated after each step of the attack.

It is instructive to observe correlations between the 
characteristics of unperturbed PTNs (see Table [T|) and 
their robustness to attacks. Such correlations may allow 
for an a priory estimate of the resilience of a network with 
respect to attacks. As discussed in the introduction, per- 
colation theory for uncorrelated networks predicts that 
the value of the MoUoy-Reed parameter k^''\ Eq. can 
be used to measure the distance to the percolation point 
^(fc) _ 2, We may therefore expect that networks with a 
higher value of k^'^-' show higher resilience. To this end let 
us first compare the values of Cg for certain scenarios with 
the value of k^''^ for the unperturbed PTN. Before doing 
this let us note that for an uncorrelated network the value 
of k'-'^-' can be equally represented by the ratio between 
the mean next neighbors number of a node zi (which is 
by definition equal to the mean node degree (fc)) and the 
mean second nearest neighbors number Z2'. 

= Z2/Z1. (14) 
Indeed, given that for such a network (see e.g. [T]) 

Z2={k^)-{k), (15) 



one can rewrite ^ as: 

K^'^ = 1 at Cperc. (16) 

The relation $\kappa^{(k)} = \kappa^{(z)} + 1$ holds only approximately
for the real-world networks we consider in our study, as
one can see, e.g., from Table 1. In Fig. 6a we compare
both quantities $\kappa^{(k)}$, $\kappa^{(z)}$ for unperturbed PTNs with
the corresponding segmentation concentration $c_s$ for the
random attack scenario. Within the expected scatter of
data one can observe a general tendency of $c_s$
to increase with both $\kappa^{(k)}$ and $\kappa^{(z)}$: the higher the value
of $\kappa$ for an unperturbed network, the more robust it is
to random removal of its vertices. This conclusion, albeit
with a more pronounced scatter of data, holds even if
one repeats the same analysis for the scenario
based on recalculated node degrees, as shown in Fig. 6b.
Again, one observes $c_s$ to increase with increasing $\kappa$. For
the betweenness-based attack scenarios the data is even
more scattered and a prediction based on the a priori
calculated ratios is unreliable.
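
As an illustration of the two ratios compared above, the following minimal
sketch computes $\kappa^{(k)}$ and $\kappa^{(z)}$ for a graph stored as an
adjacency dictionary. The function names and the two toy graphs are
assumptions made for this example and are not PTN data. The star (a tree)
satisfies $\kappa^{(k)} = \kappa^{(z)} + 1$ exactly, whereas the triangle
does not, which is one way to see why the relation is only approximate for
networks with short cycles.

```python
from collections import deque


def second_neighbors(adj, source):
    """Count nodes at distance exactly 2 from `source` (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == 2:
            continue  # no need to explore beyond the second shell
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(1 for d in dist.values() if d == 2)


def kappa_k(adj):
    """kappa^(k) = <k^2> / <k>."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return sum(k * k for k in degrees) / sum(degrees)


def kappa_z(adj):
    """kappa^(z) = z_2 / z_1, Eq. (14), with z_1 = <k>."""
    z1_total = sum(len(nbrs) for nbrs in adj.values())
    z2_total = sum(second_neighbors(adj, v) for v in adj)
    return z2_total / z1_total


# Toy graphs (hypothetical, for illustration only).
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}

for name, graph in (("star", star), ("triangle", triangle)):
    print(name, kappa_k(graph), kappa_z(graph))
```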

Another useful observation concerns the correlation
between the PTN attack resilience and the node-degree
distribution exponent $\gamma$ (3). As we have observed in
previous studies [22,23], some of the PTNs under consideration
are scale-free: their node-degree distributions have
been fitted to a power-law decay (H]) with the exponents 
shown in Table 1. Others are characterized rather by an
exponential decay, but up to a certain accuracy they can
also be approximated by a power-law behavior (then, the
corresponding exponent is shown in Table 1 in brackets).
In Fig. 7a we show the correlation between the fitted
node-degree distribution exponent $\gamma$ and $c_s$ for the random
attack scenario. Filled circles correspond to scale-free PTNs,
open circles correspond to the PTNs where the scale-free
behavior is less pronounced. It is interesting to observe
that, even if we include the PTNs which are better
described by an exponential decay of the node-degree
distributions, there is a notable tendency to find PTNs with

Fig. 6. L-space. Correlations between the ratio $\kappa$, Eqs. (3), (14), and segmentation concentration $c_s$. Open circles: $\kappa^{(k)} = \langle k^2 \rangle / \langle k \rangle$,
filled circles: $\kappa^{(z)} = z_2 / z_1$. The lines serve as guides to observe the tendency of $c_s$ to increase for higher values of $\kappa$. a. Random
scenario. Most out-of-range are the points $c_s = 0.35$, $\kappa^{(z)} = 2.54$, $\kappa^{(k)} = 4.37$ (Sydney) and $c_s = 0.35$, $\kappa^{(z)} = 6.25$, $\kappa^{(k)} = 7.91$
(Moscow). b. Recalculated node-degree scenario. Two PTNs are out of range: $c_s = 0.04$, $\kappa^{(z)} = 4.17$, $\kappa^{(k)} = 5.95$ (Sao Paolo)
and $\kappa^{(z)} = 6.25$, $\kappa^{(k)} = 7.91$ (Moscow).





Fig. 7. L-space. Correlations between the node-degree distribution exponent $\gamma$ and segmentation concentration $c_s$. Filled circles:
scale-free PTNs, open circles: PTNs with less pronounced power-law decay. Solid lines serve as guides to observe the tendency
of $c_s$ to decay with an increase of $\gamma$. a. Random scenario. Most out of range are the points at $c_s = 0.24$, $\gamma = (5.16)$ (Taipei) and
at $c_s = 0.35$, $\gamma = 4.03$ (Sydney). b. Recalculated node-degree scenario. Most out of range are the points at $c_s = 0.04$, $\gamma = 2.72$
(Sao Paolo) and at $c_s = 0.115$, $\gamma = (5.16)$ (Taipei).



smaller values of $\gamma$ to be more resilient, as indicated by
larger values of $c_s$. This tendency is again confirmed if one
considers the recalculated node degree attack scenario, as
shown in Fig. 7b.

The above observed correlation between the exponent
$\gamma$ that characterizes the unperturbed network (i.e. a PTN
at $c = 0$) and the segmentation concentration $c_s$, at which,
however, the PTN is to a large part unperturbed, indicates
that some global properties of the node-degree distribution
may remain essentially unchanged when the nodes
are removed (i.e. a scale-free distribution remains scale-free
as $c$ increases, $0 < c < c_s$). To check that assumption
for the RV scenario, we analyzed the averaged cumulative
node degree distributions for each of the PTNs with 3, 5,
and 10 % of removed nodes. The cumulative distribution



$P(k)$ is defined in terms of the node-degree distribution
$p(q)$ as:

$$P(k) = \sum_{q=k}^{k^{\max}} p(q), \qquad (17)$$

with $k^{\max}$ the maximal node degree in the given PTN.
Typical results of this analysis are shown in Fig. 8 for
the PTN of Paris. We compare the cumulative node degree
distribution $P(k)$ of the unperturbed PTN with that
of the PTN where a given fraction $c$ of the nodes
($c = 0.03$, 0.05, and 0.1, correspondingly) was removed
according to the random attack scenario (RV). For each
of the concentrations of the removed nodes, $P(k)$ was
averaged over 2000 repeated attacks.

Fig. 8. (color online). L-space. Average cumulative node degree distributions for Paris PTN for the random attack scenario.
Comparison of the initial distribution (red curve, $c = 0$) with those of the PTNs with $c = 0.03$, $c = 0.05$, $c = 0.1$ (a). Average
cumulative node degree distribution together with statistical errors for $c = 0.03$ (b), $c = 0.05$ (c), $c = 0.1$ (d).
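
The cumulative distribution of Eq. (17) and its averaging over repeated
random attacks can be sketched as follows. This is a minimal illustration
under stated assumptions, not the procedure used to produce Fig. 8: the
graph is an adjacency dictionary with set-valued entries, the function
names are hypothetical, the removed fraction is rounded to a whole number
of nodes, and $c < 1$ is assumed so that some nodes always remain.

```python
import random
from collections import Counter


def cumulative_distribution(degrees, k_max):
    """Return [P(0), ..., P(k_max)] with P(k) = sum_{q >= k} p(q), Eq. (17)."""
    n = len(degrees)
    counts = Counter(degrees)
    result, running = [0.0] * (k_max + 1), 0
    for k in range(k_max, -1, -1):
        running += counts[k]          # nodes with degree >= k seen so far
        result[k] = running / n
    return result


def degrees_after_random_removal(adj, c, rng):
    """Remove a fraction c of nodes at random; return the survivors' degrees."""
    nodes = list(adj)
    removed = set(rng.sample(nodes, int(round(c * len(nodes)))))
    return [len(adj[v] - removed) for v in nodes if v not in removed]


def averaged_cumulative(adj, c, runs, seed=0):
    """Average P(k) on the fixed grid 0..k_max over `runs` random attacks."""
    rng = random.Random(seed)
    k_max = max(len(nbrs) for nbrs in adj.values())
    total = [0.0] * (k_max + 1)
    for _ in range(runs):
        degrees = degrees_after_random_removal(adj, c, rng)
        for k, p in enumerate(cumulative_distribution(degrees, k_max)):
            total[k] += p
    return [t / runs for t in total]


# Toy graph (hypothetical): a star with four leaves.
toy = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(cumulative_distribution([len(n) for n in toy.values()], 4))
print(averaged_cumulative(toy, 0.2, runs=200))
```

Evaluating every run on the same grid of degrees is a design choice made
here so that a degree absent from one realization still contributes its
correct cumulative value to the average.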



In the first plot, Fig. 8a, we compare the three resulting
average distributions (for $c = 0.03$, 0.05, and 0.1) with
the original one ($c = 0$). One clearly sees that there is
no qualitative or even quantitative (change of exponent)
change of the distributions for any of the three cases.
Indeed, if one has a large set of nodes with a given
node-degree distribution, any sufficiently large random subset of
these nodes should have the same distribution; in particular
this holds if one averages these subset distributions
over many instances. The above argument seems to ignore
the change of degrees in the subset due to cutting off
those vertices not remaining in the set. However, due to
the random choice of the removed nodes the share of lost
degree will on average be proportional to the degree
of each vertex: the higher its degree, the more probable
it is that one of its neighbors is chosen to be removed,
and this probability is proportional to its degree. Thus,
the sum of degrees in the remaining subset is lower; but
the degree distribution $P(k)$ is effectively transformed to
$P'(ck) = nP(k)$, where $c$ is the probability of any node
being removed and $P'(k)$ is the distribution in the remaining
subset of nodes, $n$ a normalization. For an exponential
distribution this transformation shifts the scale. However,
a scale-free distribution keeps its exponent under such a
transformation.
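
The last two statements can be checked numerically. The sketch below is an
illustration only: it rescales the degree axis by a generic factor `a`
(a name introduced here for the factor multiplying $k$ in the
transformation above), omits the normalization, and uses illustrative
values for the exponent and decay scale that are not fitted to any PTN.
The log-log slope of a power law is unchanged by the rescaling, while the
decay scale of an exponential is multiplied by `a`.

```python
import math


def loglog_slope(f, k1, k2):
    """Slope of log f versus log k between k1 and k2."""
    return (math.log(f(k2)) - math.log(f(k1))) / (math.log(k2) - math.log(k1))


def semilog_slope(f, k1, k2):
    """Slope of log f versus k between k1 and k2."""
    return (math.log(f(k2)) - math.log(f(k1))) / (k2 - k1)


def rescaled(f, a):
    """Return P' with P'(a * k) = P(k), i.e. P'(k) = P(k / a); normalization omitted."""
    return lambda k: f(k / a)


gamma, k0, a = 3.0, 5.0, 0.9  # illustrative values only


def power_law(k):
    return k ** (-gamma)


def exponential(k):
    return math.exp(-k / k0)


# Power law: exponent is the same before and after the rescaling.
print(loglog_slope(power_law, 2.0, 20.0), loglog_slope(rescaled(power_law, a), 2.0, 20.0))
# Exponential: decay rate changes from 1/k0 to 1/(a*k0).
print(semilog_slope(exponential, 2.0, 20.0), semilog_slope(rescaled(exponential, a), 2.0, 20.0))
```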

In the other three plots, Figs. 8b-d, we show for each
amount of removed nodes the average cumulative distribution
together with statistical errors calculated as the
standard deviation within the ensemble of the 2000 instances
generated in the sample. Even on the logarithmic
scale these are very small for all but the very high degrees,
where fluctuations of small numbers of often less than one
node for a given degree occur.



4 Results in P-space 

Let us complement the L-space analysis performed above
by observing the reaction of PTN graphs under attack
when one observes them in another representation. In
particular, we will investigate P-space graphs.

First let us recall that in this representation each node
corresponds to a PTN station, i.e. it has the same interpretation
as in the L-space. However, the interpretation of a
link differs from that in the L-space: now all station-nodes
that belong to the same route are connected and thus each
route enters the P-space network as a complete subgraph.
This results in the main peculiarity of the interpretation
of the behavior under attacks of these graphs. Consider as
an example the P-space graph of Fig. 1 and compare it
to the original PTN map of the same figure. Whereas the removal of
station node C in the map disconnects the nodes
B and D, the removal of the same node in the P-space graph
keeps nodes B and D connected, as long as they still belong
to the same route. Therefore, the removal of nodes in
P-space, performed either in a random way or according
to certain lists, has a different interpretation in comparison
to that occurring in the L-space. An interpretation of
the removal of nodes in P-space is the following: if a node
is removed, the corresponding stop of the route is canceled
while the route otherwise keeps operating. If in the
above example the station-node C is removed, route No
2 still keeps operating and station-node B can be reached
from D, only without stopping at C (e.g. the bus takes a
shortcut). In this way, as we will see below, the removal of
nodes in P-space allows us to gain additional insight into
the PTN structure.
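
The difference between the two interpretations can be made concrete with
a small sketch. The two routes below are hypothetical and only mimic the
situation described above (stations B, C, D served in this order by route
No 2); they are not the layout of Fig. 1, and all function names are
assumptions of this example. In the L-space graph consecutive stops are
linked, in the P-space graph every pair of stops on a route is linked.

```python
from itertools import combinations


def l_space(routes):
    """Link stations that are consecutive stops on at least one route."""
    adj = {}
    for stops in routes.values():
        for s in stops:
            adj.setdefault(s, set())
        for u, v in zip(stops, stops[1:]):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def p_space(routes):
    """Link every pair of stations sharing a route (route = complete subgraph)."""
    adj = {}
    for stops in routes.values():
        for s in stops:
            adj.setdefault(s, set())
        for u, v in combinations(stops, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def remove_node(adj, node):
    """Return a copy of the graph without `node` and its links."""
    return {u: nbrs - {node} for u, nbrs in adj.items() if u != node}


def connected(adj, a, b):
    """True if b is reachable from a (depth-first search)."""
    seen, stack = {a}, [a]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return b in seen


# Hypothetical toy network: route 1 serves A, B, E; route 2 serves B, C, D.
routes = {1: ["A", "B", "E"], 2: ["B", "C", "D"]}

print(connected(remove_node(l_space(routes), "C"), "B", "D"))  # False
print(connected(remove_node(p_space(routes), "C"), "B", "D"))  # True
```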

As in the case of the L-space representation, we study
the resilience of the P-space PTN graphs to attacks performed
following the sixteen different scenarios defined in
Section 3. In Fig. 9 we show the change of the size of the
largest cluster $S$ (a) and the average inverse mean shortest
path length (b) under random attacks (RV). If
one compares this behavior with that observed for the RV
scenario in L-space (see Fig. 2), one sees that all PTNs
under consideration react in a much more homogeneous
way. In L-space random attacks lead to changes of the
largest connected component $S$ that range from an abrupt
breakdown (Dallas) to a slow smooth decrease (Paris). In
P-space one observes for the same scenario only a decrease
of $S$ which corresponds to the number of removed nodes.
No breakdown of this cluster occurs in this scenario. The
value of $S(c_s)$ defined by the condition (13) is given in the
last column of Table 3. It is worth noting that the behavior
of the mean inverse shortest path length as a
function of the fraction $c$ of disabled nodes is also qualitatively
different between the two RV scenarios in L- (Fig. 2b)
and P- (Fig. 9b) spaces. In L-space the mean inverse shortest
path length decreases in general faster than linearly, indicating
an increase of the path length between the nodes as well as
partitioning of the network. In P-space it remains for a large part
unperturbed, as the nodes of the complete subgraph remain
essentially connected and the shortest path lengths
remain almost unchanged until only a small fraction of the
network remains.

To further detail the situation, similarly to Section 3
we summarize in Table 3 the outcome of the five most
harmful attack scenarios and compare those with the random
attack scenario. As follows from the Table and as
is further supported by Fig. 10, the betweenness-targeted
scenarios appear to be the most harmful. Following this
observation let us investigate the role of the highest
betweenness nodes: above all these are the nodes (and not
the highest-$k$ hubs) that control the PTN behavior under
attack. The P-space degrees of these high-betweenness
nodes do not essentially differ from those of the hubs,
therefore they cannot be easily distinguished from the
other nodes during attacks according to the highest-$k$
scenario. To support this assumption, let us recall that in
the P-space representation each route enters the overall
network as a complete subgraph, with all nodes interconnected.
Removing nodes from a complete graph does not
lead to any segmentation. The decrease of the normalized
size of this graph will be given by the exact formula
$S = 1 - c$ (which is almost reproduced by the RV
scenario, cf. Fig. 9a). Under such circumstances a special
role is played by those nodes that join different complete
graphs (different routes). The removal of such nodes will
separate different complete routes and as a result may lead
to network segmentation. Naturally, being between different
complete subgraphs such nodes are characterized by

Fig. 10. (color online). P-space, size of the largest cluster $S$ at a: highest degree scenario (recalculated), b: highest betweenness
scenario (recalculated).

Table 3. Segmentation concentration $c_s$ for different attack scenarios applied to different PTNs in P-space. For each city,
the Table shows the five most effective attack scenarios ordered by increasing values of $c_s$. The scenario is indicated after
the corresponding value of $c_s$. The scenarios are abbreviated by the name of the characteristics used to prepare the lists of removed
nodes (see Sec. 2 for detailed explanation). In the last column the value of $c_s$ for the random scenario (RV) is shown.



high centrality indices, as observed above. Moreover, as
far as their direct neighbors belong to different complete
graphs, these neighbors are not connected to each
other, resulting in a lower value of the clustering coefficient
$C$. From Table 3 one sees that attacks based on choosing
nodes with low-$C$ values are very effective in P-space.
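
The role of a station that joins two routes can be illustrated on a toy
P-space graph. In the sketch below the two routes, the station labels and
the function names are hypothetical. Removing an ordinary station of a
complete subgraph reduces the largest cluster by exactly one node
($S = 1 - c$), removing the shared station "H" splits the graph, and the
clustering coefficient of "H" is lower than that of an ordinary station.

```python
from itertools import combinations


def p_space(routes):
    """Build the P-space graph: each route becomes a complete subgraph."""
    adj = {}
    for stops in routes:
        for s in stops:
            adj.setdefault(s, set())
        for u, v in combinations(stops, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def largest_cluster_fraction(adj, removed):
    """Normalized size S of the largest cluster after deleting `removed`."""
    seen, best = set(removed), 0   # removed nodes are never visited
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(adj)


def clustering(adj, v):
    """Fraction of pairs of neighbors of v that are themselves linked."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    links = sum(1 for u, w in combinations(adj[v], 2) if w in adj[u])
    return 2 * links / (k * (k - 1))


# Two hypothetical routes sharing the single station "H".
graph = p_space([["a1", "a2", "a3", "H"], ["H", "b1", "b2", "b3"]])

print(largest_cluster_fraction(graph, {"a1"}))   # 6/7: only the removed node is lost
print(largest_cluster_fraction(graph, {"H"}))    # 3/7: the two routes separate
print(clustering(graph, "a1"), clustering(graph, "H"))
```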

To conclude this section, we ask whether a simple
criterion can be found that allows one to predict a priori
the P-space PTN vulnerability. Namely, given the general
PTN characteristics (see Table 1), can one forecast
resilience against attacks in P-space? The answer is given
by the observation that the networks with low mean shortest
path length $\langle \ell_P \rangle$ are the best connected in P-space and
hence may be expected to be less vulnerable. Indeed, on
the one hand, for the above example of a complete graph
(a single PTN route) $\langle \ell_P \rangle = 1$ and it is extremely robust
to P-space attacks. On the other hand, a high value of
$\langle \ell_P \rangle$ indicates numerous intermediate nodes between
different routes. As we have checked above, the targeted
removal of such nodes leads to rapid network segmentation.
In support of the above reasoning, in Fig. 11 we plot $c_s$
as a function of $\langle \ell_P \rangle$ for attacks based on the highest
betweenness centrality scenario. There, within the expected
scatter of data, one observes clear evidence of the
decrease of $c_s$ with $\langle \ell_P \rangle$, i.e. networks with higher mean path
length break down at smaller values of $c$ and are thus more
vulnerable.
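
For reference, the mean shortest path length used in this criterion can be
computed by breadth-first search from every node. The sketch below is a
minimal version that averages over all ordered pairs of mutually reachable
nodes; the function name and the toy graphs are assumptions of this
example. A single route (complete graph) gives the value 1, and two routes
joined by one shared station give a larger value.

```python
from collections import deque


def mean_shortest_path(adj):
    """Mean shortest path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1     # exclude the source itself
    return total / pairs


def complete_graph(nodes):
    """Complete graph on `nodes`, i.e. a single route in P-space."""
    return {u: {v for v in nodes if v != u} for u in nodes}


# A single hypothetical route: every pair of stations is one step apart.
single_route = complete_graph(["s1", "s2", "s3", "s4"])

# Two hypothetical routes sharing the station "H".
two_routes = complete_graph(["a1", "a2", "a3", "H"])
for u, nbrs in complete_graph(["H", "b1", "b2", "b3"]).items():
    two_routes.setdefault(u, set()).update(nbrs)

print(mean_shortest_path(single_route))
print(mean_shortest_path(two_routes))
```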

It is worth noting here that in P-space it is only
the RV attack that has a very similar impact on all PTNs
(see Fig. 9). As we have just observed, similarly to the
L-space, also in P-space the PTNs manifest different levels of
robustness against attacks targeted on the most important
nodes. However, the order of vulnerability changes if one
compares the outcome of the L-space and P-space attacks.
This means that PTNs that were vulnerable in the
L-space may appear to be robust against attacks in P-space.
From Table 3 we see that the PTNs that are most stable
against highest $C_S$-targeted attacks in P-space are the

Fig. 11. P-space. Correlations between the mean shortest path
length $\langle \ell_P \rangle$ and segmentation concentration $c_s$ in the highest
betweenness centrality scenario. The line serves as a guide to
observe the tendency of $c_s$ to decrease with increasing $\langle \ell_P \rangle$.

PTNs of Hong Kong, Sao Paolo, and Moscow, with $c_s =
0.285$, 0.205, and 0.175, correspondingly. When attacked
in L-space, the PTN of Moscow keeps its robustness: $c_s =
0.07$ during $C_S$-targeted attack, which is one of the highest $c_s$
values for the L-space, see Table 2. This is however not
the case for the PTNs of Hong Kong and Sao Paolo. In
L-space, these belong to the most vulnerable PTNs.



5 Conclusions and outlook 

In this paper, we have studied the behavior of city pub- 
lic transportation networks (PTNs) under attacks. In our 
analysis we have examined PTNs of fourteen major cities 
of the world. The principal motivation behind this study 
was to observe the behavior under attack of a sample of 
networks that were constructed for the same purpose, to 
compare these with available analytical results for perco- 
lation of complex networks, and possibly to derive some 
conclusions about correlations between PTN characteristics
calculated a priori and the resilience to attacks.
Furthermore, the resilience behavior of a network against
different attack scenarios gives additional insight into the
network architecture, revealing structures on different
scales. This approach has been termed the 'tomography'
of a network [14].

In our study we have also attempted to compare our 
results with the predictions of percolation theory on net- 
works. Due to the sizes of these systems which are far from 
the thermodynamic limit and the rather small sample 
of networks no quantitative comparison appeared possi- 
ble. However, qualitative predictions about the location of 
segmentation thresholds and thus the vulnerability could 
be verified. Although our study was not primarily motivated
by applications, some of the results and methods
developed within this study may be useful for planning
and risk assessment of PTNs. Our analysis has identified
PTN structures which are especially vulnerable and others
which are particularly resilient against attacks. Further
investigation of other relevant network properties may
reveal mechanisms behind this structural resilience [32].
Furthermore, we note that the methods developed here also
allow one to identify minimal strategies to obstruct the
operation of the PTN of a city, e.g. for the purposes of industrial
action, and possibly achieve a successful end of a social
conflict.

To analyze PTN resilience we have applied different 
attack scenarios, that range from a random failure to a 
targeted destruction, when the most influential network 
nodes were removed according to their operating char- 
acteristics. To choose the most influential nodes, we have 
used different graph theoretical indicators and determined 
in such a way the most effective attack scenarios. Our
paper shows that even within a sample of networks that
were created for the same purpose one observes essential
diversity with respect to their behavior under attacks of
various scenarios. Results of our analysis show that PTNs
demonstrate a rich variety of behavior under attacks,
ranging from smooth decay to abrupt change.

As shown by our study, the impact of attacks may be
measured by different quantities. As a criterion that is
well defined and easily reproducible we choose to define
the segmentation concentration $c_s$ to correspond to the
situation where the largest remaining cluster contains one
half of the original nodes of the network. Let us note as
well that definitely not all of the PTNs analyzed
demonstrated scale-free behavior in P-space (and even less in
L-space). Nevertheless, in spite of the diversity of behavior
we clearly see common tendencies in their reaction to
attacks. In particular, this enabled us to propose criteria
that allow an a priori estimate of PTN robustness. In
L-space resilience is indicated by a high value of the
Molloy-Reed parameter $\kappa$, Eqs. (3), (14), or by a small value of the
exponent $\gamma$, if a power law is observed for the PTN node
degree distribution; in P-space high resilience is indicated
by a small mean shortest path length $\langle \ell_P \rangle$.
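
To make the segmentation criterion concrete, the following sketch removes
nodes one by one according to the recalculated highest-degree scenario and
reports the removed fraction at which the largest cluster first contains
at most one half of the original nodes. It is an illustration under stated
assumptions rather than the code used in this study: the function names
are hypothetical, ties between equal degrees are broken by dictionary
order, and the "at most one half" threshold is one possible reading of the
criterion for finite graphs.

```python
def largest_component(adj):
    """Number of nodes in the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def segmentation_concentration(adj):
    """Removed fraction c at which the largest cluster first holds at most
    half of the original nodes, for the recalculated highest-degree attack."""
    graph = {u: set(nbrs) for u, nbrs in adj.items()}   # work on a copy
    n = len(graph)
    for removed in range(1, n + 1):
        # Degrees are recalculated after every removal.
        target = max(graph, key=lambda u: len(graph[u]))
        for v in graph.pop(target):
            graph[v].discard(target)
        if largest_component(graph) <= n / 2:
            return removed / n
    return 1.0


# Toy graphs (hypothetical): a star and a path of six nodes.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}

print(segmentation_concentration(star))
print(segmentation_concentration(path))
```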

One possible continuation of our study will be the
analysis of PTN resilience in graph representations other
than those described above.